[FIXED] Improvements to dealing with old or non-existent index.db #5893
Merged
…g snapshot restore or restart.

We had a condition where an old index.db was not able to properly restore a stream due to max msgs per subject being set and certain blocks being compacted away, which removed subject info for those sequences. In addition, we fixed recovery after Truncate and PurgeEx by subject when the index.db was corrupt or not available.

This change also moves generating the index.db file to after writing the blocks during a snapshot, and we do a force call to make sure it is written even when complex.

Signed-off-by: Derek Collison <[email protected]>
neilalexander approved these changes on Sep 17, 2024
LGTM
derekcollison pushed a commit that referenced this pull request on Sep 18, 2024
Extension to #5893

If we can't update the index.db upon shutdown, for example during a hard kill, we'd enter into this condition if `MaxMsgsPer` was set.
https://github.com/nats-io/nats-server/pull/5893/files#diff-384c189826934c9a6fc3554dafc63dab2076245010e3d6fce5c71a93e15e9877R1752

However, all limits-based fields have this issue, not just `MaxMsgsPer`. Running similar tests where `nats str info` before hard kill should equal its output after hard kill:

- `MaxMsgsPer`: −7,877 diff (fixed with addition of above condition/PR)
- `MaxMsgs`: +2,123 diff
- `MaxAge`: no diff (correct messages, but still `[WRN] Filestore [stream] loadBlock error: message block data missing`)
- `MaxBytes`: +3,567 diff (had a MaxBytes set of 1016 MiB, but after restart the state has more messages and Bytes: 1020 MiB)

I think we shouldn't only target `MaxMsgsPer`, since other fields can also trigger this. Extending the condition to name those other fields would come back to bite us if we add other limits-based fields in the future and forget to add them to this condition.

We need to detect that index.db was not written during shutdown, or that there is a difference between index.db and our msg blocks. If we detect this, we can no longer rely on index.db being correct, so I'd propose to simplify and, upon detection, defer to rebuilding.

Signed-off-by: Maurice van Veen <[email protected]>
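The proposal in this commit message (detect a stale or missing index.db, then rebuild regardless of which limit is configured) can be sketched roughly as follows. This is an illustration only, not the filestore code: `msgBlock`, `streamState`, `rebuildFromBlocks`, `lastSeqOnDisk`, and `recoverState` are hypothetical names, and comparing only the last sequence is an assumed, simplified staleness check.

```go
package main

import "fmt"

// msgBlock is a hypothetical stand-in for one message block on disk.
type msgBlock struct {
	seqs []uint64 // sequences still stored in this block, ascending
}

// streamState is a hypothetical, reduced form of what an index file records.
type streamState struct {
	Msgs     uint64
	FirstSeq uint64
	LastSeq  uint64
}

// rebuildFromBlocks recomputes the state by scanning every block.
func rebuildFromBlocks(blocks []msgBlock) streamState {
	var st streamState
	for _, b := range blocks {
		for _, seq := range b.seqs {
			if st.Msgs == 0 || seq < st.FirstSeq {
				st.FirstSeq = seq
			}
			if seq > st.LastSeq {
				st.LastSeq = seq
			}
			st.Msgs++
		}
	}
	return st
}

// lastSeqOnDisk returns the last sequence found in the newest non-empty block.
func lastSeqOnDisk(blocks []msgBlock) (uint64, bool) {
	for i := len(blocks) - 1; i >= 0; i-- {
		if n := len(blocks[i].seqs); n > 0 {
			return blocks[i].seqs[n-1], true
		}
	}
	return 0, false
}

// recoverState trusts the index only when it exists and agrees with the
// blocks. The check does not look at any configured limit, so it does not
// need updating when a new limits-based field is added.
// The second return value reports whether a rebuild happened.
func recoverState(idx *streamState, blocks []msgBlock) (streamState, bool) {
	last, ok := lastSeqOnDisk(blocks)
	if idx == nil || !ok || idx.LastSeq != last {
		return rebuildFromBlocks(blocks), true
	}
	return *idx, false
}

func main() {
	blocks := []msgBlock{
		{seqs: []uint64{3, 4}},
		{seqs: []uint64{6, 8}},
	}
	// The index was last written before sequence 8 was stored (hard kill).
	stale := &streamState{Msgs: 3, FirstSeq: 3, LastSeq: 6}
	st, rebuilt := recoverState(stale, blocks)
	fmt.Printf("state=%+v rebuilt=%v\n", st, rebuilt)
}
```

A real check would likely compare more than the last sequence (for example block checksums), but the shape is the same: any disagreement means the index is discarded and the state is rebuilt from the blocks.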
neilalexander pushed a commit that referenced this pull request on Sep 20, 2024